ABSTRACT

The Open Multitrack Testbed is an online repository of
multitrack audio accessible to the public, with rich metadata
annotation, a semantic database and search functionality. Two
years after it first went live, the dataset is the largest and
most diverse available, and still growing. An overview of
the available content, some prominent features, and example
uses in the field of intelligent music production are
discussed.
Figure 1: Browse interface screenshot

1. INTRODUCTION 


A large part of music production research is concerned with
the analysis and manipulation of multitrack audio. As a
consequence, there is a need for a large number of multitrack
recordings for investigating recording and mixing practices,
evaluating algorithms, and demonstrating new ideas. However,
multitrack content is scarce, in part due to licensing
issues. To address this, we have created the Open Multitrack
Testbed [1], a collection of annotated multitracks with an
associated website (multitrack.eecs.qmul.ac.uk).

In this context, a multitrack audio item, or song, is
defined as a set of more than two streams (or tracks) of audio
which are meant to be played alongside each other. In
addition to these tracks, some songs also contain mixes
(processed sums of the raw tracks) and stems (processed sums
of a subset of these tracks, e.g. only the drum parts).
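
To make this terminology concrete, the following minimal sketch
treats each track as a short list of samples and models the
"processing" as a single gain per track. That simplification is an
assumption made purely for illustration, since real mixes and stems
involve arbitrary processing. The function name processed_sum, the
track names and all sample values are hypothetical.

```python
from typing import Dict, List, Optional


def processed_sum(tracks: Dict[str, List[float]],
                  gains: Dict[str, float],
                  subset: Optional[List[str]] = None) -> List[float]:
    """Sum the selected tracks sample by sample after applying a gain to each.

    The per-track gain stands in for the processing applied in a real
    mix or stem; it is an illustrative assumption only.
    """
    names = list(tracks) if subset is None else subset
    length = len(tracks[names[0]])
    out = [0.0] * length
    for name in names:
        for i, sample in enumerate(tracks[name]):
            out[i] += gains.get(name, 1.0) * sample
    return out


# Hypothetical three-track song with four samples per track.
tracks = {
    "kick": [0.5, 0.0, 0.5, 0.0],
    "snare": [0.0, 0.5, 0.0, 0.5],
    "vocal": [0.25, 0.25, 0.25, 0.25],
}
gains = {"kick": 1.0, "snare": 0.5, "vocal": 2.0}

# A mix sums all raw tracks; a stem sums only a subset (here the drums).
mix = processed_sum(tracks, gains)
drum_stem = processed_sum(tracks, gains, ["kick", "snare"])
```

In this sketch the only difference between a mix and a stem is the
set of tracks passed in, which mirrors the definitions above.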


2. FEATURES

To quickly find suitable content, the web application
includes browse and search functionality (Figures 1 and 2),
to allow filtering and searching using the various metadata
properties. The metadata associated with different songs,
stems, mixes and tracks (Figure 3) is visualised within the
application, and each item can be downloaded separately.

The database offers a SPARQL endpoint to query and
insert data through HTTP requests. The infrastructure
further supports user accounts and different levels of access,
for instance when licenses are less liberal, and a convenient
metadata input interface.

3. CONTENT

Launched in 2014, the Testbed's initial collection was taken
from an internal dataset of multitrack audio content at the
Centre for Digital Music, and it is still being continually
expanded with locally and remotely hosted content. At the
time of writing, it contains close to 600 songs, of which
some have up to 300 individual constituent tracks from
several takes, and others up to 400 mixes of the same source
content.

A wide range of metadata is supported, and included to
the extent that it is available for the different items. The
metadata is structured using established knowledge
representation methods such as the Music Ontology [2] and
the Studio Ontology [3]. Song attributes include title,
artist, license, composer, and recording location; track
attributes include instrument, microphone, sampling rate,
number of channels, and take number; and mix attributes
include mixing engineer, audio render format, and digital
audio workstation (DAW) name and version. These properties
can be used to search, filter and browse the content to find
the desired audio.

4. USE CASES

With a dataset of this size and diversity, and such a wide
range of metadata available, the Testbed can be and has been
used for various research topics including audio analysis
[4], training and testing machine learning models [5] and
analysis of music production practices [6].
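
Selecting material for such studies can be scripted against the
SPARQL endpoint mentioned in Section 2. The sketch below shows, under
stated assumptions, how a query could be sent over HTTP following the
standard SPARQL 1.1 Protocol and how the standard JSON result format
could be flattened. The endpoint URL is a placeholder, since the
actual address is not given here, and the vocabulary in the query
(Music Ontology and Dublin Core terms) is a guess at the schema, not
the Testbed's documented data model. The helper names build_request
and parse_bindings are hypothetical, and the sample response is made
up.

```python
import json
import urllib.parse
import urllib.request

# Placeholder endpoint URL (assumption); the real address is not stated here.
ENDPOINT = "http://example.org/testbed/sparql"

# Assumed vocabulary: Music Ontology classes plus Dublin Core titles.
# The Testbed's real schema may differ.
QUERY = """
PREFIX mo: <http://purl.org/ontology/mo/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?track ?title WHERE {
  ?track a mo:Track ;
         dc:title ?title .
} LIMIT 10
"""


def build_request(endpoint: str, query: str) -> urllib.request.Request:
    """Build a SPARQL 1.1 Protocol GET request asking for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"})


def parse_bindings(payload: str) -> list:
    """Flatten a SPARQL JSON result document into a list of plain dicts."""
    rows = json.loads(payload)["results"]["bindings"]
    return [{var: cell["value"] for var, cell in row.items()} for row in rows]


request = build_request(ENDPOINT, QUERY)
# Against a live endpoint (not executed in this sketch):
#   with urllib.request.urlopen(request) as response:
#       rows = parse_bindings(response.read().decode("utf-8"))

# Offline demonstration with a made-up response in the standard format.
sample = ('{"head": {"vars": ["track", "title"]}, "results": {"bindings": ['
          '{"track": {"type": "uri", "value": "http://example.org/track/17"},'
          ' "title": {"type": "literal", "value": "electric guitar"}}]}}')
rows = parse_bindings(sample)
```

Building the request separately from sending it keeps the sketch
usable offline and makes the query easy to inspect before it is run
against a real endpoint.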

A number of other multitrack audio resources exist, but
they contain a smaller number of items, are less diverse,
have ambiguous or restricted licensing, and/or provide
little or no metadata. Furthermore, the Testbed uniquely has
a number of songs with several mixes including DAW files
containing all parameter settings [7]. Where licensing
allows it, these resources are mirrored within the Testbed. For
unclear or less liberal licenses, the metadata is still added to
the database, but links point to third-party websites.

Researchers, journals, conferences and funding bodies
increasingly prefer data to be open, as it allows reproduction
and extension of results. The Testbed facilitates widespread
usage of a single, but large and diverse dataset, allowing,
for instance, different algorithms to be compared on the
same data.

The authors highly welcome any use of and contributions
to the Testbed.

Figure 2: Search interface screenshot

Figure 3: Track view screenshot. The view lists tracks 1 to 25
of a song by instrument (accordion, acoustic guitar, bass
guitar, floor tom, male and female backing vocals, hi-hat,
keyboard, kick drum, male lead vocal, electric guitar, finger
snaps, snare drum, overhead drums and tom) and shows the
metadata of the selected track:

Index: 17
Instrument: electric guitar
Number of channels: 1-Mono
Microphone: AEA R84
Processor, Preamplifier, Converter, Take number: (empty)
Sampling rate (Hz): 96000
Bit depth: 24
File size (MB): 62.61
File name testbed: http://c4dm.eecs.qmul.ac.uk/multitrack/TheDoneFors/LeadMe/Raw/Paul%20GTR%20R84.wav
archive.org mirror link: https://archive.org/download/LeadMe_201402/Paul%20GTR%20R84.wav